Typographical and Orthographical Spelling Error Correction
Abstract
This paper focuses on techniques for selecting the best correction of misspelt words at the lexical level. Spelling errors arise from either cognitive or typographical mistakes, so a robust spelling correction algorithm must cover both. To find the most effective spelling correction system, the paper considers several strategies: ranking heuristics, correction algorithms, and correction priority strategies for selecting the best candidate. The strategies also take account of error types, syntactic information, word frequency statistics, and character distance. The findings show that it is very hard to generalise a spelling correction strategy across different types of data sets, such as those containing typographical, orthographical, and scanning errors.
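As a rough illustration of how character distance and word frequency statistics can be combined into a ranking heuristic, the sketch below orders candidate corrections by plain Levenshtein distance and breaks ties by frequency. It is not the paper's method: the function names, the `max_dist` threshold, and the toy frequency table `TOY_FREQ` are all assumptions made for this example.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_candidates(word: str, freq: Dict[str, int], max_dist: int = 2) -> List[str]:
    """Rank lexicon words: smaller distance first, then higher frequency."""
    scored = [(edit_distance(word, w), -f, w) for w, f in freq.items()]
    return [w for d, _, w in sorted(scored) if d <= max_dist]


# Hypothetical word counts, invented for illustration only.
TOY_FREQ = {"spelling": 40, "spewing": 5, "selling": 60,
            "spieling": 1, "correction": 30}

print(rank_candidates("speling", TOY_FREQ))
```

Here "spelling", "spewing", and "spieling" are all one edit away from "speling", so frequency decides their order, while the more frequent "selling" ranks below them because it is two edits away. A heuristic like this is tuned to typographical slips; cognitive (orthographical) errors would likely need a different distance measure, which is consistent with the abstract's point that one strategy is hard to generalise.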
Similar resources
Triphone Analysis: A Combined Method For The Correction Of Orthographical And Typographical Errors
Most existing systems for the correction of word level errors are oriented toward either typographical or orthographical errors. Triphone analysis is a new correction strategy which combines phonemic transcription with trigram analysis. It corrects both kinds of errors (also in combination) and is superior for orthographical errors.
Improving the Recognition Accuracy of Text Recognition Systems Using Typographical Constraints
Spelling correction techniques can be used to improve the recognition accuracy of text recognition systems. In this paper a new spelling-error model is proposed that is especially suited to the correction of recognition errors occurring during the recognition of printed documents. An implementation of this model is described that exploits typographical constraints derived from character shapes....
Joint English Spelling Error Correction and POS Tagging for Language Learners Writing
We propose an approach to correcting spelling errors and assigning part-of-speech (POS) tags simultaneously for sentences written by learners of English as a second language (ESL). In ESL writing, there are several types of errors such as preposition, determiner, verb, noun, and spelling errors. Spelling errors often interfere with POS tagging and syntactic parsing, which makes other error dete...
Typographical Nearest-Neighbor Search in a Finite-State Lexicon and Its Application to Spelling Correction
A method of error-tolerant lookup in a finite-state lexicon is described, as well as its application to automatic spelling correction. We compare our method to the algorithm by K. Oflazer [14]. While Oflazer’s algorithm searches for all possible corrections of a misspelled word that are within a given similarity threshold, our approach is to retain only the most similar corrections (nearest nei...
Design and implementation of Persian spelling detection and correction system based on Semantic
The Persian language has special features (graphemes, homophones, and multi-shape clinging characters) on electronic devices. Furthermore, the design and implementation of NLP tools for Persian are more challenging than for other languages (e.g. English or German). Spelling tools are widely used for editing user texts such as emails and text in editors. Also developing Persian tools will provide Persian progr...